The newly emergent human coronavirus HKU1 (HCoV-HKU1) was first identified in Hong Kong in 2005.
Infection by HCoV-HKU1 occurs worldwide and causes syndromes such as the common cold, bronchitis, and
pneumonia. The CoV main protease (Mpro), a key enzyme in viral replication through its proteolytic
processing of the replicase polyproteins, has been recognized as an attractive target for rational drug design.

In this study, we report the structure of HCoV-HKU1 Mpro in complex with a Michael acceptor inhibitor, N3.

The structure of HCoV-HKU1 provides a high-quality model for group 2A CoVs, which are distinct from group
2B CoVs such as severe acute respiratory syndrome CoV. The structure, together with activity assays, supports
the relative conservation at the P1 position that was discovered by sequencing the HCoV-HKU1 genome.
Combined with structural data from other CoV Mpros, the HCoV-HKU1 Mpro structure reported here provides
insights into both substrate preference and the design of antivirals targeting CoVs.


Coronaviruses (CoVs) are positive-strand RNA viruses that
have been identified as the main etiologic agents responsible
for a vast number of enteric, gastric, and respiratory syndromes
of both humans and animals (14-17, 19, 21, 23, 26, 30, 31, 34,
45). CoVs can be divided into three groups: group 1 (including
human CoV 229E [HCoV-229E] and transmissible gastroenteritis
virus [TGEV]), group 2 (including HCoV-OC43, murine
hepatitis virus [MHV], and bovine CoV [BCoV]), and group 3
(including avian infectious bronchitis virus [IBV]). Shortly after
the emergence of severe acute respiratory syndrome CoV
(SARS-CoV) in 2003, group 2 CoVs were further divided into
two subgroups, termed 2A and 2B (46). The classical group 2
viruses constitute subgroup 2A, while the newly emergent
SARS-CoV and its animal counterparts (37) form subgroup
2B. Group 1 and group 2 CoVs have more impact on human
health than group 3, since group 3 CoVs (such as avian IBV)
can only infect avian species. Following the outbreak of SARS,
group 2 CoVs have continued to attract greater attention for
two reasons. First, they include human viruses (SARS-CoV
and HCoV-OC43) as well as several important animal viruses
(MHV and BCoV) that serve as useful models for CoV-host
interactions. Second, group 2 CoVs are reported to have
crossed the animal-to-human species barrier in two instances:
one bat-to-human transmission in group 2B (27, 37) and one
transmission event in group 2A, in which BCoV led to
the emergence of HCoV-OC43 (36).

Group 2A HCoVs were less widely studied prior to the
global SARS epidemic in 2003. However, they are closely
associated with a wide range of acute or chronic respiratory
syndromes (3, 4, 7-9, 11, 12, 15, 20, 22, 35, 39, 40, 47). In the
wake of the SARS outbreak, several novel HCoVs have been
discovered, one of which is HCoV-HKU1 (9, 39). HCoV-HKU1
has achieved global distribution since it was first identified
in 2005: infections were first characterized in Hong Kong
(26), followed by the identification of several strains of the
virus in Korea (9), Europe (5, 17), Australia (31), and North
America (14). In contrast to the lethal SARS-CoV, infection by
HCoV-HKU1 usually leads to self-limiting syndromes affecting
the lower respiratory tract. Nevertheless, the consequences
could be more severe in patients with a compromised or
immature immune system, such as asthma sufferers or newborn
infants (24). Genome sequencing has confirmed that
HCoV-HKU1 belongs to CoV group 2A and shares high
sequence homology with MHV and BCoV (39).

The functional components of the CoV replication machinery
are released via posttranslational cleavage by two or three
proteases. These proteases were first designated the papain-like
protease (PLP) and 3C-like protease (3CL) for their respective
sequence homology to the papain and rhinovirus 3C
proteases. The 3CL protease also is commonly known as the
main protease (Mpro) because of the major role it plays in the
proteolytic pathway, which makes it the most attractive
pharmacological target for anti-CoV drug design. CoV Mpros have
been intensively studied, and crystal structures have been
determined for the Mpros from the following CoVs: HCoV strain
229E (HCoV-229E) (2), porcine TGEV (1), avian IBV (41),
and SARS-CoV (44). These structures are representative of
group 1 (HCoV-229E and TGEV), group 2B (SARS-CoV),
and group 3 (IBV) CoVs. However, no structure of the Mpro
from a group 2A CoV (MHV, HCoV-HKU1, and HCoV-OC43)
has been determined to date. The absence of structural
data presents a major obstacle for structure-aided drug
optimization targeting group 2A CoVs.

The Mpros from different CoV groups are homologous in
both sequence and main-chain architecture. They share a similar
substrate binding sequence, with a requirement for glutamine
at the P1 position and a strong preference for leucine/methionine
at P2. Based on this information, broad-spectrum
lead compounds (43) with micromolar Ki values have been
designed that target CoV Mpros. However, structural data for
the Mpros from classical group 2A CoVs still are not available,
posing a problem for further optimization.

Although CoV Mpros exhibit absolute specificity for glutamine
in the P1 position, recent research (38) has shown that
the Mpro from HCoV-HKU1 may possess an unusual substrate
preference at the P1 site quite different from that of other CoV
Mpros. Here, we report the structure of HCoV-HKU1 Mpro,
which serves as a model for group 2A CoVs, in complex with a
synthetic peptidomimetic inhibitor, N3. The structure and
subsequent enzyme activity assays help to resolve the issue of the
relative conservation at the P1 position based on genome
sequencing. Moreover, this complex structure provides further
structural data for rational drug design against HCoVs.

MATERIALS AND METHODS 

cDNA and plasmid. The cDNA encoding HCoV-HKU1 Mpro was kindly
provided by K. Y. Yuen from the Department of Microbiology, Hong Kong
University, Hong Kong Special Administrative Region, China. BamHI and XhoI
restriction sites were attached to the 5' and 3' ends, respectively, by PCR, and the
PCR product first was inserted into the pMD-18T vector (Takara). The DNA of
interest then was cleaved from the T vector and subcloned into a glutathione
S-transferase-tagged expression vector, pGEX-4T-1. The validity of the whole
procedure was confirmed by DNA sequencing.

Protein expression and purification. The plasmid was first transformed into
the commercial Escherichia coli strain BL21(DE3) Rosetta (Invitrogen). After
incubation at 37°C overnight on an Amp+ agar Luria-Bertani (LB) plate, fresh
transformants were inoculated into 5 ml LB medium in the presence of 100
µg/ml ampicillin. After growth for 12 h, the culture was scaled up to
1 liter LB medium with the same concentration of antibiotic in a 2-liter flask,
and the culture was shaken vigorously at 37°C until the optical density at 600 nm
reached 0.6. Cells were induced with 0.5 mM isopropyl-β-D-thiogalactopyranoside
(Sigma) at 16°C overnight.

Cell pellets were harvested by centrifugation, resuspended in 40 ml
phosphate-buffered saline with 2 mM dithiothreitol and 7 mM β-mercaptoethanol,
and sonicated on ice for 25 min. The supernatant was collected after
centrifugation of the sonicate at 15,000 rpm for 40 min.

Affinity purification was achieved by passing the supernatant twice through 2 ml
glutathione S-transferase affinity medium. On-column digestion lasted for
16 h at 4°C with thrombin (New England BioLabs), and the protein of interest
was harvested and concentrated to 30 mg/ml. The N3 inhibitor then was added
to a final molar ratio of 1:1 and incubated at 4°C overnight. Finally, the
HCoV-HKU1 Mpro-inhibitor complex was purified by gel filtration using a Superdex 200
(10/30) column (GE Healthcare). The protein concentration was adjusted to 20
mg/ml for crystallization trials.

Crystallization and structure determination. Crystals of HCoV-HKU1 Mpro
were grown in 0.1 M imidazole, pH 6.0, and 0.6 M sodium acetate by the hanging
drop vapor diffusion method. Synchrotron X-ray diffraction data were collected
on beamline BL-5A of the Photon Factory (Tsukuba, Japan) and processed to
2.5-Å resolution, using HKL2000 (29) for data indexing and scaling. Molecular
replacement using the SARS-CoV Mpro structure (Protein Data Bank entry 2AMQ;
48% identity) as a template was performed with PHASER (32). The manual
rebuilding of the structure was performed using Coot (13), and the structure was
refined using REFMAC in the CCP4 suite (10). Final modification was carried
out using CNS (6). The volume of the S1 cavity was calculated using VOIDOO
(25).

Enzyme activity assays. Substrates and analogs were designed through three 
rounds of affinity optimization (42) by substrate mimicry and from a library of 
substrate analogs. The substrate and analogs were synthesized by Dawei Ma from 
the Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 
China. 

The strategy employed for enzyme activity assays of HCoV-HKU1 Mpro has
been described previously (43). Activity assays for HCoV-HKU1 Mpro against
the CoV consensus substrate and the HCoV-HKU1-specific substrate followed a
similar protocol, which is described briefly below. The consensus substrate and
HCoV-HKU1-specific substrates were fluorescent compounds with the sequences
MCA-AVLQSGFR-Lys(Dnp)-Lys-NH2 and MCA-PRLHCTTN-Lys(Dnp)-Lys-NH2,
respectively (greater than 95% purity; GL Biotech Shanghai
Ltd., Shanghai, China). A P1 single-mutant substrate also was synthesized with
the sequence MCA-AVLHSGFR-Lys(Dnp)-Lys-NH2.

The excitation and emission wavelengths of the fluorescent substrates were
320 and 405 nm, respectively. A buffer consisting of 50 mM Tris-HCl (pH 7.3)
and 1 mM EDTA was used for enzyme activity assays at 30°C. The reaction was
initiated by adding protease (final concentration, 2 µM) to a solution containing
different final concentrations of the substrate (3.2 to 40 µM). Strict kinetic
parameters for the inhibition assay were determined according to the previously
reported protocol (43). All results from enzyme activity assays were calculated
using data based on at least three independent parallel experiments.
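
As an illustration of how Km and kcat can be extracted from initial rates measured over a substrate series such as the one above, the following is a minimal sketch using a double-reciprocal (Lineweaver-Burk) fit. It is not the authors' protocol, which is described in reference 43; the fitting method, the function names, and the noise-free example rates (generated from an assumed Km of 80 µM and kcat of 1.0 per second) are all assumptions made for illustration only.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def michaelis_menten_from_rates(substrate_uM, rates_uM_per_s, enzyme_uM):
    """Estimate Km (uM) and kcat (1/s) from a double-reciprocal plot.

    1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, so the slope is Km/Vmax
    and the intercept is 1/Vmax; kcat = Vmax / [E].
    """
    inv_s = [1.0 / s for s in substrate_uM]
    inv_v = [1.0 / v for v in rates_uM_per_s]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax / enzyme_uM


# Hypothetical example: 2 uM enzyme, substrate from 3.2 to 40 uM,
# rates simulated (not measured) from assumed Km = 80 uM, kcat = 1.0 1/s.
enzyme = 2.0
substrates = [3.2, 5.0, 10.0, 20.0, 40.0]
rates = [1.0 * enzyme * s / (80.0 + s) for s in substrates]

km, kcat = michaelis_menten_from_rates(substrates, rates, enzyme)
print(f"Km = {km:.1f} uM, kcat = {kcat:.2f} 1/s")
```

With real, noisy fluorescence data a direct nonlinear fit of the Michaelis-Menten equation is usually preferred, because the reciprocal transform amplifies error at low substrate concentrations.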

Coordinate accession number. Coordinates and structure factors for the
HCoV-HKU1 Mpro in complex with inhibitor N3 have been deposited in the Protein
Data Bank under entry ID 3D23.

RESULTS 

Structural overview. Four protein molecules (denoted A, B,
C, and D) occupy one asymmetric unit, with one N3 molecule
per protomer. Two of the protomers form a typical homodimer,
while the remaining two protomers dimerize with
their adjacent symmetry-related counterparts (Fig. 1a). Each
protomer exhibits a three-domain (I to III) architecture that is
common to other CoV Mpro structures (1, 2, 42, 44): domains
I and II have chymotrypsin-like folds, and domain III displays
a globular α-helical cluster that is unique to CoV Mpro. The
catalytic site, including the Cys-His dyad, and the relatively
shallow substrate binding pocket of HCoV-HKU1 Mpro are
located in the cleft between domains I and II. The
substrate-binding pocket features two deeply buried sites (P1 and P2)
and several sites with different levels of solvent exposure (P3,
P4, and P5) (Fig. 1b). X-ray data collection and refinement
statistics are summarized in Table 1.

Michael acceptor and catalytic dyad. Clear and continuous
electron density was observed between the reactive backbone
carbon atom of the N3 inhibitor and the Sγ atom of Cys145 in
the inhibitor-bound HCoV-HKU1 Mpro structure. We conclude
that this reaction can be categorized as an electrophilic
addition mediated by a Michael acceptor, obeying Ki-k3
kinetics, where Ki is the dissociation constant and k3 is the
turnover number, according to the following scheme:

\[
E + S \;\overset{K_m}{\rightleftharpoons}\; ES \;\xrightarrow{k_{cat}}\; E\text{--}S \qquad (1)
\]

As the covalently bound inhibitor is a mimic of the real peptide
substrate, it is possible to model the transition state by treating
the enzyme-inhibitor complex structure as a snapshot of the
catalytic dyad, and hence to predict parameters of the Km-kcat
kinetics, according to the following scheme:

\[
E + I \;\overset{K_i}{\rightleftharpoons}\; EI \;\xrightarrow{k_3}\; E\text{--}I \qquad (2)
\]
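
To make the two-step E + I scheme more concrete, the sketch below computes the pseudo-first-order inactivation rate that such a scheme implies when the inhibitor is in excess and the binding step equilibrates rapidly: k_obs = k3[I] / (Ki + [I]), with k3/Ki acting as the second-order rate constant at low inhibitor concentration. These are standard textbook relations rather than anything derived in this paper, and the numerical values of Ki and k3 used here are hypothetical placeholders, not measurements for N3.

```python
import math


def k_obs(inhibitor_uM, ki_uM, k3_per_s):
    """Pseudo-first-order inactivation rate for E + I <-> EI -> E-I.

    Assumes rapid-equilibrium binding and inhibitor in large excess.
    """
    return k3_per_s * inhibitor_uM / (ki_uM + inhibitor_uM)


def fraction_active(t_s, inhibitor_uM, ki_uM, k3_per_s):
    """Fraction of enzyme not yet covalently modified after t_s seconds."""
    return math.exp(-k_obs(inhibitor_uM, ki_uM, k3_per_s) * t_s)


# Hypothetical parameters, chosen only to illustrate the behavior.
KI_UM = 10.0      # assumed dissociation constant (uM)
K3_PER_S = 0.02   # assumed covalent-bond formation rate (1/s)

second_order = K3_PER_S / KI_UM  # uM^-1 s^-1, valid when [I] << Ki

for conc in (1.0, 10.0, 100.0):
    rate = k_obs(conc, KI_UM, K3_PER_S)
    left = fraction_active(60.0, conc, KI_UM, K3_PER_S)
    print(f"[I] = {conc:6.1f} uM  k_obs = {rate:.4f} 1/s  active after 60 s = {left:.3f}")
```

The saturation of k_obs at high inhibitor concentration is what separates the binding term (Ki) from the chemical step (k3), which is why both the fit of the side chains in the pockets and the geometry of the oxyanion loop matter for this class of inhibitor.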

This catalytic dyad involves residues His41 and Cys145, and the
intermediate state might be stabilized by the oxyanion hole
(28) formed by the backbone amides of the oxyanion loop from
Phe140 to Cys145 (Fig. 2a). The oxyanion hole is crucial to the
stabilization of the intermediate state, so the formation of the
oxyanion hole has a significant influence on kcat. As discussed
for the inhibitor design targeting human rhinovirus 3C
proteases (28), the correct organization of this oxyanion loop also
is essential to the k3 step for mechanism-based suicide
inhibitors.

FIG. 1. (a) Structural overview of four protomers (A, green; B, cyan; C, magenta; and D, yellow) in one asymmetric unit, represented as
cartoons. N3 inhibitors are shown as blue sticks. (b) Structural overview of the enzyme-inhibitor complex of one monomer unit. The main chain
of the enzyme is represented as blue cartoons, and the synthetic inhibitor is shown as yellow sticks. The three domains are labeled.

Surrounding the Sγ atom of Cys145, we observe well-defined
amides from the loop from residues 142 to 145. Similarly to
human rhinovirus 3C proteases (28), these amide dipoles
construct a tetrahedral oxyanion hole. From native and complex
structures of SARS-CoV Mpro, the correct orientation of these
backbone amides is triggered and maintained by substrate
binding, in particular by the binding of the P1 residue and
interaction between the N finger and the substrate (2, 44), in
which the backbone carbonyl of Leu141 is hydrogen bonded to
the side chain oxygen of Ser144. The correct position of
Leu141 is maintained by a hydrogen bond between the
carbonyl group of Phe140 and the amide group of the P1 side
chain and by hydrophobic stacking between His163 and
Phe140. Although the analysis of the HCoV-HKU1 Mpro
structure in complex with N3 (Fig. 2a) shows that the P1 side chain
exerts no direct influence on the residues forming the oxyanion
hole, its side chain oxygen atom forms a strong hydrogen bond
(2.6 Å) with His163 and helps to strengthen the stacking
interaction with Phe140. Furthermore, the nitrogen atom of the
P1 side chain also forms a hydrogen bond (3.1 Å) with the
backbone of Phe140, thus helping to maintain the oxyanion
loop (Phe140-Cys145) in its proper conformation. For the
above reasons, we conclude that the P1 side chain is important
for the network of interactions stabilizing the oxyanion hole.

TABLE 1. X-ray data-processing and refinement statistics

Statistic                          Value for the HCoV-HKU1 Mpro-N3 complex (e)

Data collection
  Wavelength (Å)                   1.0
  Resolution limit (Å)             50.0-2.48 (2.62-2.48)
  Space group                      P4₁
  Cell parameters
    a (Å)                          91.770
    b (Å)                          91.770
    c (Å)                          187.914
    β (°)                          90
  Total no. of reflections         510,408
  No. of unique reflections        108,501
  Completeness (%)                 99.1 (98.64)
  Redundancy                       5.0 (4.9)
  Rmerge (a)                       0.11
  I/σ(I)                           16 (5)

Refinement
  Resolution range (Å)             50.0-2.5
  Rwork (b) (%)                    22.9
  Rfree (%)                        28.5
  rmsd (c) from ideal geometry
    Bonds (Å)                      0.012
    Angles (°)                     1.59
  Avg B factor (Å²)
    Protein                        38.7
                                   43.7
  Ramachandran plot (d)
    Favored (%)                    86.8
    Allowed (%)                    12.2
    Generously allowed (%)         0.9
    Disallowed (%)                 0.1

(a) $R_{merge} = \sum |I_i - \langle I \rangle| / \sum I$, where $I_i$ is the intensity of an individual reflection
and $\langle I \rangle$ is the average intensity of that reflection.
(b) $R_{work} = \sum \big| |F_p| - |F_c| \big| / \sum |F_p|$, where $F_c$ is the calculated and $F_p$ is the observed
structure factor amplitude.
(c) rmsd, root mean square deviation.
(d) Ramachandran plots were generated by using the program PROCHECK.
(e) Numbers in parentheses correspond to the highest-resolution shell.


The S1 pocket has a smaller size to accommodate P1 histidine.
Given its crucial role in the catalytic process, glutamine
outperforms other residues as the signature of the Mpro
substrate at the P1 position. In addition to this advantage, the side
chain of glutamine in the P1 position suitably fits with residues
forming the S1 subsite via Van der Waals interactions (Fig.
2b). From the HCoV-HKU1 genome sequence, 11 out of 12
Mpro recognition sites have Gln at the P1 position. In our
structure, the N3 molecule has a lactam ring as an analog to
the glutamine residue (the cross-linking between the Cγ and N
atoms helps to select the stretching conformation from the
ensemble of rotamers and better occupy the binding cleft). In
the HCoV-HKU1 Mpro structure, the lactam ring protrudes into
the S1 pocket via a hydrogen bond to the imidazole ring NH of
His163 at a distance of 2.6 Å. However, unlike the SARS-CoV
Mpro structure in complex with N3, the NH of the HKU1-N3
lactam ring fails to recruit a water molecule to satisfy a second
S1 hydrogen bond. Instead, the N-terminal Oγ atom might
provide a weak electronegative interaction to stabilize the NH
atom; the interaction likely is stronger due to the presence of
redundant residues as a cloning artifact, hindering the
N-terminal Ser from coming any closer to the NH of the P1 side chain.

FIG. 2. (a) Details of the interaction between the P1 side chain and
the defined oxyanion loop, shown in stereo representation. Side chains
are shown as sticks, and the crucial hydrogen bond between His163
and the substrate side chain is shown by a cyan dashed line. (b) Details
of the substrate-binding pocket. The inhibitor is shown in the following
color scheme: C, white; O, red; and N, blue. The crucial residues of the
enzyme are shown in the following color scheme: C, cyan; O, red; N,
blue; and S, yellow. Hydrogen bonds are shown as red dashed lines.

TABLE 2. Activity assay of HCoV-HKU1

Substrate (a)      Km (µM)          kcat
C                  83.2 ± 13.3      1.1 ± 0.12
H                  22.3 ± 5.2       0.09 ± 0.013
SMC                265.3 ± 21.5     0.03 ± 0.001

(a) Substrate C refers to the consensus substrate of CoV Mpros with the
sequence MCA-AVLQSGFR-Lys(Dnp)-Lys-NH2. Substrate H refers to the
specific substrate of HCoV-HKU1, including a mutation at the P1 site from
glutamine to histidine, with the sequence MCA-PRLHCTTN-Lys(Dnp)-Lys-NH2.
Substrate SMC refers to the consensus substrate, including a mutation at the P1
site from glutamine to histidine, with the sequence
MCA-AVLHSGFR-Lys(Dnp)-Lys-NH2.

Nevertheless, compared with the Mpros from other CoV
groups, the structure of the HCoV-HKU1 Mpro has an S1
pocket with a relatively smaller volume of ~18.1 Å³. In
contrast, the volume of the S1 pocket of TGEV Mpro is ~19.1 Å³,
that of IBV Mpro is ~21.7 Å³, and that of SARS-CoV Mpro
is ~19.5 Å³. The reduced size of the S1 pocket might be caused
by the position of the loop Leu167-Cys171, which is bent
upward by about 90°. As a result, the smaller S1 pocket might
tolerate mutation to short-chain residues at the P1 position, in
which case a weakened oxyanion hole is to be expected. Novel
substrate specificity already has been found in the HCoV-HKU1
genome, in which the Mpro recognition site between the
helicase and exonuclease utilizes histidine instead of glutamine
at the P1 position. Mimicking proteolysis in the cell, enzyme
activity assays using a synthetic fluorogenic substrate confirm
the existence of such a cleavage event in vitro and exhibit novel
enzymatic properties not seen with the consensus substrate
(Table 2).

Enzyme activity assays indicate that the affinity for a
substrate containing a single mutation at the P1 position decreases
to 30% of the affinity for the native consensus substrate, which
can be attributed to the loss of a hydrogen bond resulting from
the mutation of glutamine to histidine. The scissile velocity
decreases to 3%, which is to be expected, since the histidine
residue lacks an oxygen atom that is required to form a strong
hydrogen bond and support the intermediate oxyanion hole.
However, when determining the influence of P2-P5 variance in
the HCoV-HKU1-specific substrate, we observed an unusual
fourfold elevation in affinity (a fourfold lower Km) compared to
that of the consensus substrate and a minor rescue of the scissile
velocity (a threefold elevation relative to the single-mutant
substrate). This could be explained by the contribution of the
non-P1 residues in the HCoV-HKU1-specific substrate, since
activity assays for the single-mutant substrate imply that
mutation at the P1 position has a detrimental effect not only on
the substrate binding affinity but also on the substrate scissile
velocity.
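
The percentages and fold changes quoted above follow directly from the mean values in Table 2. The short sketch below reproduces that arithmetic; it treats 1/Km as a proxy for binding affinity, as the text does, which is a simplifying assumption, and it ignores the reported standard deviations. The variable names are illustrative only.

```python
# Mean values from Table 2: substrate -> (Km in uM, kcat in the table's units).
TABLE2 = {
    "C": (83.2, 1.1),      # consensus substrate
    "H": (22.3, 0.09),     # HCoV-HKU1-specific substrate (P1 = His)
    "SMC": (265.3, 0.03),  # consensus substrate with P1 Gln -> His
}

km_c, kcat_c = TABLE2["C"]
km_h, kcat_h = TABLE2["H"]
km_smc, kcat_smc = TABLE2["SMC"]

# Relative affinity, using 1/Km as a proxy (assumption).
affinity_smc_vs_c = km_c / km_smc   # single mutant relative to consensus
affinity_h_vs_c = km_c / km_h       # HKU1-specific relative to consensus

# Relative turnover.
kcat_smc_vs_c = kcat_smc / kcat_c   # single mutant relative to consensus
kcat_h_vs_smc = kcat_h / kcat_smc   # HKU1-specific relative to single mutant

# Catalytic efficiency kcat/Km for each substrate.
efficiency = {name: kcat / km for name, (km, kcat) in TABLE2.items()}

print(f"SMC affinity vs C: {affinity_smc_vs_c:.0%}")
print(f"SMC kcat vs C:     {kcat_smc_vs_c:.1%}")
print(f"H affinity vs C:   {affinity_h_vs_c:.1f}-fold")
print(f"H kcat vs SMC:     {kcat_h_vs_smc:.1f}-fold")
```

On these numbers the HCoV-HKU1-specific substrate binds more tightly than the consensus substrate but is still turned over far more slowly, so its kcat/Km remains below that of the consensus substrate.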

The S2 pocket presents group-specific features but no
group-specific substrate preferences. The P2 side chain of the
ligand protrudes into the S2 pocket via interactions with the
hydrophobic side chains of Met25, Pro52, and Tyr54 (Fig. 2b).
The lid of the pocket is covered by a short 3₁₀ helical region
from Ser45 to Asn51. To compare the diversity of the S2 pockets
of all three CoV groups, the backbones of Mpro complex
structures from all groups were superimposed (Fig. 3): for group 1,
TGEV Mpro in complex with the inhibitor N1, an ancestor of
N3; for group 3, IBV Mpro in complex with N3; and for group
2B, SARS-CoV Mpro in complex with inhibitor N3. We observed
three modes of secondary structure: the 3₁₀ helix (HCoV-HKU1
and SARS-CoV), a loose loop (IBV), and a tight loop
(TGEV). Interestingly, the clustering of the secondary
structure correlates with the current classification of CoVs. We
then explored the natural recognition sequences to examine
whether the group-specific features could result in different
substrate specificities at the P2 site (Table 3). After
summarizing the P2 residue type in the protease recognition sites of
the HCoV-HKU1 pp1ab polyprotein, we observe that Mpros prefer a
hydrophobic residue at this position, which is also the case for
SARS-CoV, IBV, and TGEV. Although there are a few
exceptions, such as asparagine or valine residues, leucine and
methionine are the most abundant. This is consistent with the
observation from our structure that the hydrophobic P2 side
chain extends into the deep S2 site without clashing with the
Van der Waals surface of the pocket (Fig. 2b). Therefore, on
the one hand, considering the flexibility of the S2 pocket as well as
the residual space after occupation by the P2 residue, the
optimal choice of leucine or methionine might be related to
the size of the S2 pocket. On the other hand, the similar
preferences at the S2 sites among group 1, 2A, 2B, and 3 CoV
Mpros challenge the efficacy of designing group-specific
inhibitors by altering only P2 moieties.

FIG. 3. S1 and S2 binding sites of HCoV-HKU1 Mpro. Main chains
of four Mpro structures are superimposed and displayed in the
neighborhood of the substrate-binding site. The S1 and S2 binding sites are
highlighted by light green shadows. The main chains are represented in
worm forms. Different colors are used to represent the strain of CoV.
Lemon, synthetic compound; magenta, HCoV-HKU1; light green,
SARS-CoV; light blue, TGEV; and yellow, avian IBV.

TABLE 3. P2 residues in different CoV genomes

Mpro cleavage      P2 residue for:
site no.           SARS-CoV    HCoV-HKU1    IBV    TGEV
1                  L           L            L      L
2                  F           L            L      N
3                  V           I            M      V
4                  L           L            L      L
5                  L           M            L      L
6                  L           L            L      L
7                  M           V            V      M
8                  L           M            L      L
9                  L           L            L      L
10                 L           L            L      L
11                 L           M            L      L

FIG. 4. (a) Overview of the P3 pocket. The inhibitor resides at the interface of molecules B and D. A cyan surface model is shown
covering protomers B and D. The inhibitor is shown in magenta. (b) P3 interaction site of substrate in detail. Neighboring residues within 4 Å of
the S3 site are colored green. An Fo-Fc map contoured at 1.5σ around the inhibitor is displayed in cyan. For the inhibitor, C atoms are colored
yellow, N atoms are colored blue, and O atoms are colored red. The protein carbon atoms are colored gray. Neighboring main chains are displayed
as white ribbons.
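
The P2 preference summarized in Table 3 can be tallied directly. The sketch below counts residue types per virus from the 11 cleavage sites listed in the table; the grouping of L, M, I, V, and F as "hydrophobic" is a common convention assumed here for illustration, and the helper name is hypothetical.

```python
from collections import Counter

# P2 residues at Mpro cleavage sites 1 to 11, transcribed from Table 3.
P2_RESIDUES = {
    "SARS-CoV":  "LFVLLLMLLLL",
    "HCoV-HKU1": "LLILMLVMLLM",
    "IBV":       "LLMLLLVLLLL",
    "TGEV":      "LNVLLLMLLLL",
}

# Residues treated as hydrophobic for this tally (assumed classification).
HYDROPHOBIC = set("LMIVF")


def summarize(residues):
    """Return (counts, number of Leu/Met sites, number of hydrophobic sites)."""
    counts = Counter(residues)
    leu_met = counts["L"] + counts["M"]
    hydrophobic = sum(n for aa, n in counts.items() if aa in HYDROPHOBIC)
    return counts, leu_met, hydrophobic


for virus, residues in P2_RESIDUES.items():
    counts, leu_met, hydrophobic = summarize(residues)
    total = len(residues)
    print(f"{virus:10s} Leu/Met {leu_met}/{total}  hydrophobic {hydrophobic}/{total}  {dict(counts)}")
```

Leucine or methionine occupies at least 9 of the 11 sites in every column, which is the pattern the text relies on when it argues that P2 alone is unlikely to provide group specificity.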

The P3 position. Two out of four protomers in one asymmetric
unit exhibit a solvent-exposed P3 side chain, which may interact
weakly with the edge of the substrate-binding cleft via Van der
Waals forces (Fig. 2b). To assess whether P3 side chain variation
can influence the potency of inhibition, we synthesized a small
library of six inhibitors (see Table S1 in the supplemental
material) and assayed their affinity by the second-order reaction
coefficients. The results are summarized in Fig. S1 in the supplemental
material. It appears unlikely that the preference can be attributed
exclusively to weak interactions. When investigating other
possible P3-related interactions found in our HCoV-HKU1 Mpro
structure, we scrutinized and evaluated the mainly hydrophobic
structural interaction (Fig. 4a) between molecules B and D (Fig.
4b). From the crystal packing, we observed a crystallographic
contact close to the P3 position (see Fig. S3 in the supplemental
material) that we suspect might affect this property of the P3
residue. To examine the physiological relevance, we conducted
dynamic light-scattering experiments to check for
higher-molecular-weight states that should be expected in solution if this
interaction is related to one of the stable physiological states.
However, dynamic light-scattering experiments did not identify a
tetramer or higher-molecular-weight complex in aqueous solution
(see Fig. S2 in the supplemental material). Thus, the substrate
selectivity at the P3 position may be attributed to other weak
factors, such as solvent-side chain interactions and Van der Waals
interactions with the substrate binding cleft, rather than to a
strong and direct interaction, which is more likely to be influenced
by crystal contacts.

FIG. 5. (a) Sequence alignment of a typical Mpro from CoV group 2A exhibits high homology. The alignment was performed with ClustalW
(33), and the final figure was generated with ESPript 1.0 (18). White letters with red backgrounds refer to identical residues, red letters with white
backgrounds refer to conservative variation, and black letters with white backgrounds refer to nonconservative mutations. (b) Three-dimensional
representation of nonconserved mutations in group 2A CoV Mpros mapped onto the HCoV-HKU1 Mpro structure. Identical residues are colored

FIG. 6. Superposition of representative CoV Mpro in complex with
Michael acceptor-based inhibitors of group 1 (TGEV; blue), group 2A
(HCoV-HKU1; green), group 2B (SARS-CoV; magenta), and group 3
(IBV; red). The color of each inhibitor is consistent with that of its
host.


DISCUSSION 

The HKU1 structure is a suitable model for group 2A CoV
Mpros. The crystal structure of Mpro from HCoV-HKU1 is the
first to be determined for a group 2A Mpro. Since members
of the group 2A CoVs share particularly high sequence identity
(4) (Fig. 5a), nonconservative changes occur mainly in flexible
regions of the HCoV-HKU1 Mpro structure, including domain
linkages and the molecular surface. Notable variations from
residues 46 to 71 in group 2A sequences are located in or
near the S2 pocket, which might influence properties relating to
enzyme activity. However, since even greater differences
between the different groups of CoVs produce no particular
enzyme-specific preferences in the S2 pocket, the relatively small
variations here are unlikely to challenge the consensus
substrate preference among CoVs at the P2 position. The
residue ranges 20 to 40, 140 to 160, and 187 to 189, as well as
residue 166, are highly conserved and are involved in the
formation of the S1' and S1 pockets, together with the walls of the
binding pocket of the P3 side chain. Thus, it is reasonable to
conclude that the HCoV-HKU1 Mpro structure is a suitable
model for the study of group 2A CoVs, both in terms of
enzyme activity and inhibitor design.

Michael acceptor inhibitors interact with CoV Mpro in a
similar manner. As Michael acceptor suicide inhibitors, N3
and its derivatives cocrystallize with the CoV Mpro in a similar
manner (Fig. 6). The backbones of the peptidomimetic
compounds align antiparallel to the β-strands constituting the
binding cleft. The P1 and P2 residues fit into the S1 and S2
pockets, respectively, and make a major contribution to
substrate preference: glutamine at P1 and leucine/methionine at
P2. In our future optimization of Mpro inhibitors, we think that
the glutamine (or its analog) might be worth keeping in the P1
position, while it would be reasonable to conduct a comparison
of leucine to methionine for the evaluation of the P2 residue.
Aside from these deeply buried side chains, the solvent-exposed
P3 provides no straightforward information for the
substrate-enzyme interaction, though the variation at this position
shows an obvious impact on inhibition. Alternatively, we might
employ random screening for further optimization at the P3
position.

Conclusions. Structural data now are available for CoV
Mpro-inhibitor complexes from all CoV groups, including the
two subgroups of the group 2 CoVs. Moreover, these
structures provide further confirmation for the efficacy of
wide-spectrum inhibitors at atomic resolution. From enzyme activity
assays, we succeeded in identifying the atypical substrate
specificity of HCoV-HKU1 Mpro with higher affinity (Km) and
lower reactivity (kcat) than those of the consensus CoV Mpro
substrate. We attributed these properties to the contribution of
non-P1 residues and the distortion of the oxyanion hole.
Although the S2 pockets from different groups show
group-specific features, an investigation of the natural recognition
sequences does not find different residue-type specificity at the
P2 site.

Considering the high identity shared by group 2A CoVs,
these structural features of HCoV-HKU1 Mpro, together with
corresponding enzyme activity assays, will help to profile
HCoV-HKU1 and other newly emerging etiologic agents from
this group of CoVs.